 correction data


Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections

Neural Information Processing Systems

We address key challenges in Dataset Aggregation (DAgger) for real-world contact-rich manipulation: how to collect informative human correction data and how to effectively update policies with this new data. We introduce Compliant Residual DAgger (CR-DAgger), which contains two novel components: 1) a Compliant Intervention Interface that leverages compliance control, allowing humans to provide gentle, accurate delta action corrections without interrupting the ongoing robot policy execution; and 2) a Compliant Residual Policy formulation that learns from human corrections while incorporating force feedback and force control. Our system significantly enhances performance on precise contact-rich manipulation tasks using minimal correction data, improving base policy success rates by over 60% on two challenging tasks (book flipping and belt assembly) while outperforming both retraining-from-scratch and finetuning approaches. Through extensive real-world experiments, we provide practical guidance for implementing effective DAgger in real-world robot learning tasks.
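The idea of a delta action correction on top of a running base policy can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the action space, the function names `apply_residual` and `correction_label`, or the `max_delta` bound, and the force feedback and force control described above are left out entirely.

```python
import numpy as np

def apply_residual(base_action, residual, max_delta=0.02):
    """Add a bounded delta correction to the base policy's action.

    max_delta is a hypothetical safety bound, not a value from the paper.
    """
    base_action = np.asarray(base_action, dtype=float)
    residual = np.clip(np.asarray(residual, dtype=float), -max_delta, max_delta)
    return base_action + residual

def correction_label(executed_action, base_action):
    """Training target for a residual policy: the difference between the
    action actually executed under human correction and the base action."""
    executed_action = np.asarray(executed_action, dtype=float)
    base_action = np.asarray(base_action, dtype=float)
    return executed_action - base_action
```

Under these assumptions, the base policy keeps running unchanged, and only the small residual term is learned from the correction data.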


SegQC: a segmentation network-based framework for multi-metric segmentation quality control and segmentation error detection in volumetric medical images

arXiv.org Artificial Intelligence

Quality control of structure segmentation in volumetric medical images is important for identifying segmentation errors in clinical practice and for facilitating model development by enhancing network performance in semi-supervised and active learning setups. This paper introduces SegQC, a novel framework for segmentation quality estimation and segmentation error detection. SegQC estimates the quality of a segmentation in volumetric scans and in their individual slices and identifies possible segmentation error regions within a slice. The key components include: 1) SegQC-Net, a deep network that takes a scan and its segmentation mask as input and outputs segmentation error probabilities for each voxel in the scan; 2) three new segmentation quality metrics, two overlap metrics and a structure size metric, computed from the segmentation error probabilities; 3) a new method for detecting possible segmentation errors in scan slices, computed from the segmentation error probabilities. We introduce a new evaluation scheme to measure segmentation error discrepancies based on an expert radiologist's corrections of automatically produced segmentations, which yields smaller observer variability and is closer to actual segmentation errors. We demonstrate SegQC on three structures in 198 fetal MRI scans: the fetal brain, the fetal body, and the placenta. To assess the benefits of SegQC, we compare it to unsupervised Test Time Augmentation (TTA)-based quality estimation. Our studies indicate that SegQC outperforms TTA-based quality estimation in terms of Pearson correlation and MAE for fetal body and fetal brain segmentation. Our segmentation error detection method achieved recall and precision rates of 0.77 and 0.48 for fetal body, and 0.74 and 0.55 for fetal brain, respectively. SegQC improves the estimation of segmentation metrics for whole scans and individual slices and also detects error regions.
Introduction: The segmentation of structures in volumetric medical images is increasingly used in clinical practice for a variety of diagnostic and prognostic tasks. Since manual delineation of structures' contours is time-consuming and requires expertise, a variety of automatic segmentation methods have been developed.
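The evaluation measures named in the abstract (Pearson correlation, MAE, recall, and precision) have standard definitions, sketched below. This is not the SegQC implementation: the function names and the toy inputs are assumptions, and how the paper matches detected error regions to reference errors is not stated in the text above.

```python
import numpy as np

def pearson_and_mae(estimated, reference):
    """Pearson correlation and mean absolute error between estimated
    and reference quality scores (e.g. one score per scan or slice)."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    r = float(np.corrcoef(estimated, reference)[0, 1])
    mae = float(np.mean(np.abs(estimated - reference)))
    return r, mae

def precision_recall(predicted, actual):
    """Precision and recall for binary error flags (1 = error flagged)."""
    predicted = np.asarray(predicted, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    tp = int(np.sum(predicted & actual))    # flagged and truly erroneous
    fp = int(np.sum(predicted & ~actual))   # flagged but correct
    fn = int(np.sum(~predicted & actual))   # missed errors
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Read this way, a recall of 0.77 with a precision of 0.48 means most true errors are flagged, at the cost of roughly one false alarm per true detection.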


Learning From Mistakes Makes LLM Better Reasoner

arXiv.org Artificial Intelligence

Large language models (LLMs) have recently exhibited remarkable reasoning capabilities in solving math problems. To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process. Consider a human student who fails to solve a math problem: the student learns from the mistake that was made and from how to correct it. Mimicking this error-driven learning process, LEMA incorporates mistake-correction data pairs when fine-tuning LLMs. Specifically, we first collect inaccurate reasoning paths from various LLMs, and then employ GPT-4 as a "corrector" to identify the mistake step, explain the reason for the mistake, correct the mistake, and generate the final answer. In addition, we apply a correction-centric evolution strategy that effectively expands the question set for generating correction data. Experiments across various LLMs and reasoning tasks show that LEMA consistently improves on CoT-alone fine-tuning. Our further analysis sheds light on the non-homogeneous effectiveness between CoT data and correction data, and on the contribution of different correction information. These results suggest a significant potential for LLMs to improve through learning from their mistakes. Our code and models are publicly available at https://github.com/microsoft/LEMA.
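One way to picture a mistake-correction data pair is as a prompt holding the question and the flawed reasoning path, and a target holding the four parts the corrector produces. The sketch below is hypothetical: the `Correction` class, the `build_pair` function, the field names, and the prompt wording are all assumptions for illustration and are not taken from the LEMA repository.

```python
from dataclasses import dataclass

@dataclass
class Correction:
    mistake_step: int        # index of the first incorrect step (1-based)
    explanation: str         # why that step is wrong
    corrected_solution: str  # reasoning redone from the mistake onward
    final_answer: str

def build_pair(question, wrong_steps, correction):
    """Assemble one fine-tuning example from a flawed reasoning path
    and its correction. The template is illustrative only."""
    numbered = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(wrong_steps, start=1)
    )
    prompt = (
        f"Question: {question}\n"
        f"Attempted solution:\n{numbered}\n"
        "Identify and correct the mistake."
    )
    target = (
        f"Incorrect step: Step {correction.mistake_step}\n"
        f"Explanation: {correction.explanation}\n"
        f"Corrected solution: {correction.corrected_solution}\n"
        f"Final answer: {correction.final_answer}"
    )
    return {"prompt": prompt, "target": target}
```

For example, a two-step attempt at "What is 2 + 3 * 4?" that adds before multiplying would be paired with a correction pointing at step 1 and ending in the final answer 14.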


Accuracy evaluation of a Low-Cost Differential Global Positioning System for mobile robotics

arXiv.org Artificial Intelligence

Differential GPS, commonly referred to as DGPS, is a well-known and very accurate localization system for many outdoor applications, in particular for mobile outdoor robotics. The most common drawback of DGPS systems is the high cost of both the base station and the receivers. In this paper, we present a setup that uses third-party open-source software and a Ublox ZED-F9P chip to build a ROS-enabled low-cost DGPS setup that is ready to use in a few hours. The main goal of this paper is to analyze and evaluate the repetitive and absolute accuracy of the system. The first measurement also examines the differences between a SAPOS base station and a locally installed one consisting of low-cost components. During the evaluation of the absolute accuracy, a moving mobile robot is used on the receiver side. It is tracked by a highly accurate VICON motion capture system.
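The difference between repetitive and absolute accuracy can be made concrete with a small sketch. The abstract does not give the paper's exact metrics, so the RMS-based definitions, the function names, and the assumption that fixes and reference positions are already expressed in the same metric frame and matched in time are all illustrative choices.

```python
import numpy as np

def repeatability(fixes):
    """RMS distance of repeated position fixes from their own mean.
    Measures scatter only; a constant offset does not affect it."""
    fixes = np.asarray(fixes, dtype=float)
    deviations = fixes - fixes.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum(deviations ** 2, axis=1))))

def absolute_error(fixes, reference):
    """RMS distance between position fixes and time-matched reference
    positions (e.g. from a motion capture system)."""
    diff = np.asarray(fixes, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

A receiver that reports the same wrong position every time would score a repeatability of zero while still showing a large absolute error, which is why the two quantities are evaluated separately.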